Taking advantage of Wikipedia in Natural Language Processing
Authors
Abstract
Wikipedia is an online encyclopedia written collaboratively on the web by many contributors. Although it was not created to support research in language processing, its size and well-formed structure are attracting many researchers in the area. In this review, we select five representative papers that show various creative uses of Wikipedia within the past three years.
Similar resources
WRPA: A System for Relational Paraphrase Acquisition from Wikipedia
In this paper we present WRPA, a system for Relational Paraphrase Acquisition from Wikipedia. WRPA extracts paraphrasing patterns that express a particular relation between two entities, taking advantage of Wikipedia's structure. What is new in this system is that its exploitation of Wikipedia goes beyond infoboxes, reaching itemized information embedded in Wikipedia pages. WRPA is language independent, a...
Disentangling the Wikipedia Category Graph for Corpus Extraction
In several areas of research, such as knowledge management and natural language processing, domain-specific corpora are required for tasks such as terminology extraction and ontology learning. The investigations presented here are based on the assumption that Wikipedia can be used for corpus extraction. Wikipedia has the advantage of possessing a semantic layer, which should ease ...
Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles
When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human-level intelligence, they inevitably need knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...
Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
Structural information in web text provides natural annotations for NLP problems such as word segmentation and parsing. In this paper we propose a discriminative learning algorithm to take advantage of the linguistic knowledge in large amounts of natural annotations on the Internet. It utilizes the Internet as an external corpus with massive (although slight and sparse) natural annotations, and...